Overview

Dataset statistics

Number of variables: 13
Number of observations: 266691
Missing cells: 13199
Missing cells (%): 0.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 72.4 MiB
Average record size in memory: 284.8 B
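
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming "missing cells (%)" means missing cells divided by variables times observations, and that 1 MiB is 1024 * 1024 bytes (the variable names are illustrative, not taken from the tool):

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 13
n_observations = 266_691
missing_cells = 13_199
total_size_mib = 72.4

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# The reported size is rounded to one decimal place, so the per-record
# figure computed here only approximates the reported 284.8 B.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"bytes/record:  {avg_record_bytes:.1f}")
```

The small gap between the computed bytes per record and the reported value is what one would expect from the rounding of the total size.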

Variable types

NUM: 10
CAT: 3

Reproduction

Analysis started: 2020-05-26 08:26:39.785782
Analysis finished: 2020-05-26 08:27:18.083411
Duration: 38.3 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]

Warnings

brand has constant value "Globe Postpaid" Constant
cellId has a high cardinality: 266690 distinct values High cardinality
unitId has a high cardinality: 9534 distinct values High cardinality
revenueData is highly correlated with data and 2 other fields High correlation
data is highly correlated with revenueData and 1 other fields High correlation
sms is highly correlated with revenueData and 1 other fields High correlation
revenueVoice is highly correlated with voice and 1 other fields High correlation
voice is highly correlated with revenueVoice and 1 other fields High correlation
revenueSms is highly correlated with data and 4 other fields High correlation
unitId has 13198 (4.9%) missing values Missing
data is highly skewed (γ1 = 507.4377743) Skewed
voice is highly skewed (γ1 = 344.228712) Skewed
sms is highly skewed (γ1 = 393.2152103) Skewed
revenueData is highly skewed (γ1 = 509.488547) Skewed
revenueVoice is highly skewed (γ1 = 346.2313916) Skewed
revenueSms is highly skewed (γ1 = 467.6719718) Skewed
yieldData is highly skewed (γ1 = 500.3643936) Skewed
yieldSms is highly skewed (γ1 = 33.91918453) Skewed
dataScaleFactor is highly skewed (γ1 = 505.5685903) Skewed
cellId is uniformly distributed Uniform
voice has 124571 (46.7%) zeros Zeros
sms has 124111 (46.5%) zeros Zeros
revenueVoice has 124643 (46.7%) zeros Zeros
revenueSms has 124168 (46.6%) zeros Zeros
yieldVoice has 124643 (46.7%) zeros Zeros
yieldSms has 124168 (46.6%) zeros Zeros
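
The skewness warnings (γ1) can be read with a small experiment in mind: a single extreme value is enough to push skewness very high. The sketch below uses the plain moment-based definition, m3 / m2^1.5. The report does not say which estimator it applies, so treat the function as an illustrative assumption rather than the tool's formula; the toy data is invented.

```python
def skewness(values):
    """Population (moment-based) skewness: third central moment / variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

symmetric = list(range(1, 100))          # evenly spread values, no tail
with_outlier = symmetric + [1_000_000]   # the same values plus one extreme point

print(f"symmetric:    {skewness(symmetric):.3f}")
print(f"with outlier: {skewness(with_outlier):.3f}")
```

With one dominant outlier among n values, this measure approaches (n - 2) / sqrt(n - 1), which is roughly 516 for 266691 rows. Skewness values near 500 in this report are therefore consistent with one extreme record dominating each affected column, although the report itself does not confirm that.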

Variables

cellId
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count: 266690
Unique (%): 100.0%
Missing: 1
Missing (%): < 0.1%
Memory size: 2.0 MiB
STALUCIAJ-I_4RFS: 1
VIRRAMALJ-I_4RFS: 1
CARCBUZ-RA: 1
ALFONSO2Z-B: 1
ALAEZ-B: 1
Other values (266685): 266685
ValueCountFrequency (%) 
STALUCIAJ-I_4RFS1< 0.1%
 
VIRRAMALJ-I_4RFS1< 0.1%
 
CARCBUZ-RA1< 0.1%
 
ALFONSO2Z-B1< 0.1%
 
ALAEZ-B1< 0.1%
 
LEONRDOSTJ-41< 0.1%
 
LORENZOTANL-1721< 0.1%
 
SFLU4H-411< 0.1%
 
SNICOLPAMH-421< 0.1%
 
SAMPALOCLAGH-431< 0.1%
 
ALTVISH-521< 0.1%
 
SMCSJOSERZLH-431< 0.1%
 
KIAMBAH-321< 0.1%
 
GTZAMBOFOCIDZ-A1< 0.1%
 
MAMBL2H-521< 0.1%
 
TOWER2ONEF-111< 0.1%
 
PBURGOSZ-L21< 0.1%
 
CRISTOBALK-1431< 0.1%
 
WASHINK-1311< 0.1%
 
FAIRVIEWF-11< 0.1%
 
TAYTAYCENK-1311< 0.1%
 
MATALATALAF-131< 0.1%
 
TIBUNGF-121< 0.1%
 
OZAMIS-11< 0.1%
 
NEPODAGUPIDJ-C_4RFS1< 0.1%
 
Other values (266665)266665> 99.9%
 

Length

Max length: 36
Median length: 12
Mean length: 12.87280786
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 46
Unique unicode categories: 6
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
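
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories for one cellId value using Python's standard `unicodedata` module. This is a stand-in for illustration; the report does not state which library it uses internally.

```python
import unicodedata
from collections import Counter

# Example value taken from the cellId frequency table in this report.
value = "STALUCIAJ-I_4RFS"

# unicodedata.category returns a two-letter general category code, e.g.
# "Lu" (Uppercase Letter), "Nd" (Decimal Number),
# "Pd" (Dash Punctuation), "Pc" (Connector Punctuation).
categories = Counter(unicodedata.category(ch) for ch in value)

for code, count in categories.most_common():
    print(code, count)
```

Summing such counters over every value in a column would give totals comparable to the "Most occurring categories" table.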

Most occurring characters

ValueCountFrequency (%) 
A37927611.0%
 
-2666907.8%
 
N1798665.2%
 
L1794575.2%
 
S1719325.0%
 
R1702355.0%
 
11390654.1%
 
I1368834.0%
 
O1368404.0%
 
C1257533.7%
 
T1216223.5%
 
B1119133.3%
 
M1076113.1%
 
E1033173.0%
 
F1011402.9%
 
G954272.8%
 
3925442.7%
 
2859192.5%
 
Z856612.5%
 
U826362.4%
 
P743442.2%
 
D717262.1%
 
H692522.0%
 
4629941.8%
 
J552271.6%
 
Other values (21)2257326.6%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter271333679.0%
 
Decimal Number41641512.1%
 
Dash Punctuation2666907.8%
 
Connector Punctuation365741.1%
 
Lowercase Letter32< 0.1%
 
Space Separator15< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A37927614.0%
 
N1798666.6%
 
L1794576.6%
 
S1719326.3%
 
R1702356.3%
 
I1368835.0%
 
O1368405.0%
 
C1257534.6%
 
T1216224.5%
 
B1119134.1%
 
M1076114.0%
 
E1033173.8%
 
F1011403.7%
 
G954273.5%
 
Z856613.2%
 
U826363.0%
 
P743442.7%
 
D717262.6%
 
H692522.6%
 
J552272.0%
 
K390161.4%
 
V357841.3%
 
Y346681.3%
 
W193620.7%
 
Q130740.5%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-266690100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
113906533.4%
 
39254422.2%
 
28591920.6%
 
46299415.1%
 
7165054.0%
 
672501.7%
 
062341.5%
 
543481.0%
 
89060.2%
 
96500.2%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_36574100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
15100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
l825.0%
 
s618.8%
 
n515.6%
 
a412.5%
 
c39.4%
 
r39.4%
 
o39.4%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin271336879.0%
 
Common71969421.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
A37927614.0%
 
N1798666.6%
 
L1794576.6%
 
S1719326.3%
 
R1702356.3%
 
I1368835.0%
 
O1368405.0%
 
C1257534.6%
 
T1216224.5%
 
B1119134.1%
 
M1076114.0%
 
E1033173.8%
 
F1011403.7%
 
G954273.5%
 
Z856613.2%
 
U826363.0%
 
P743442.7%
 
D717262.6%
 
H692522.6%
 
J552272.0%
 
K390161.4%
 
V357841.3%
 
Y346681.3%
 
W193620.7%
 
Q130740.5%
 
Other values (8)113460.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
-26669037.1%
 
113906519.3%
 
39254412.9%
 
28591911.9%
 
4629948.8%
 
_365745.1%
 
7165052.3%
 
672501.0%
 
062340.9%
 
543480.6%
 
89060.1%
 
96500.1%
 
15< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3433062100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
A37927611.0%
 
-2666907.8%
 
N1798665.2%
 
L1794575.2%
 
S1719325.0%
 
R1702355.0%
 
11390654.1%
 
I1368834.0%
 
O1368404.0%
 
C1257533.7%
 
T1216223.5%
 
B1119133.3%
 
M1076113.1%
 
E1033173.0%
 
F1011402.9%
 
G954272.8%
 
3925442.7%
 
2859192.5%
 
Z856612.5%
 
U826362.4%
 
P743442.2%
 
D717262.1%
 
H692522.0%
 
4629941.8%
 
J552271.6%
 
Other values (21)2257326.6%
 

data
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct count: 265204
Unique (%): 99.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43462.27619601458
Minimum: 0.0
Maximum: 459636600.3531999
Zeros: 599
Zeros (%): 0.2%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 34.58408498
Q1: 1794.753161
median: 9912.397559
Q3: 35339.98161
95-th percentile: 197706.4997
Maximum: 459636600.4
Range: 459636600.4
Interquartile range (IQR): 33545.22845

Descriptive statistics

Standard deviation: 895186.5912
Coefficient of variation (CV): 20.59686398
Kurtosis: 260518.1124
Mean: 43462.2762
Median Absolute Deviation (MAD): 9514.414659
Skewness: 507.4377743
Sum: 1.15909979e+10
Variance: 8.01359033e+11
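
Several of these figures are derived from others, which gives a quick consistency check. A minimal sketch using the standard definitions (CV as standard deviation over mean, IQR as Q3 minus Q1, variance as the squared standard deviation), with the inputs copied from the `data` statistics:

```python
# Values copied from the `data` variable's statistics in this report.
mean = 43462.2762
std = 895186.5912
q1 = 1794.753161
q3 = 35339.98161

cv = std / mean        # coefficient of variation
iqr = q3 - q1          # interquartile range
variance = std ** 2    # variance is the squared standard deviation

print(f"CV       = {cv:.5f}")
print(f"IQR      = {iqr:.5f}")
print(f"Variance = {variance:.6e}")
```

The results agree with the reported CV, IQR and variance to the precision shown, so the same check can be repeated for the other numeric variables.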
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
05990.2%
 
0.000330< 0.1%
 
0.000229< 0.1%
 
0.000417< 0.1%
 
0.00114< 0.1%
 
0.000714< 0.1%
 
0.001514< 0.1%
 
0.001613< 0.1%
 
0.001213< 0.1%
 
0.000113< 0.1%
 
0.000613< 0.1%
 
0.000513< 0.1%
 
0.005711< 0.1%
 
0.00211< 0.1%
 
0.001111< 0.1%
 
0.002211< 0.1%
 
0.003510< 0.1%
 
0.001810< 0.1%
 
0.00389< 0.1%
 
0.00199< 0.1%
 
0.00269< 0.1%
 
0.00139< 0.1%
 
0.00349< 0.1%
 
0.00319< 0.1%
 
0.00149< 0.1%
 
Other values (265179)26578299.7%
 
ValueCountFrequency (%) 
05990.2%
 
8.693003778e-061< 0.1%
 
9.132211955e-061< 0.1%
 
3.162021001e-051< 0.1%
 
5.045368632e-051< 0.1%
 
5.619877416e-051< 0.1%
 
6.511585038e-051< 0.1%
 
0.000113< 0.1%
 
0.00011269428961< 0.1%
 
0.0001378557981< 0.1%
 
ValueCountFrequency (%) 
459636600.41< 0.1%
 
3389193.5971< 0.1%
 
3161778.0711< 0.1%
 
2482581.1371< 0.1%
 
2444452.8221< 0.1%
 
2423076.2281< 0.1%
 
2293691.1891< 0.1%
 
2177162.4361< 0.1%
 
2054240.1851< 0.1%
 
2023448.2711< 0.1%
 

voice
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct count: 17691
Unique (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1713.1626414089715
Minimum: 0.0
Maximum: 4166893.0
Zeros: 124571
Zeros (%): 46.7%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 50
Q3: 1336
95-th percentile: 8664
Maximum: 4166893
Range: 4166893
Interquartile range (IQR): 1336

Descriptive statistics

Standard deviation: 9239.866623
Coefficient of variation (CV): 5.393455589
Kurtosis: 154838.32
Mean: 1713.162641
Median Absolute Deviation (MAD): 50
Skewness: 344.228712
Sum: 456885058
Variance: 85375135.22
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
012457146.7%
 
13720.1%
 
23060.1%
 
42790.1%
 
32670.1%
 
52510.1%
 
62330.1%
 
92230.1%
 
82130.1%
 
72120.1%
 
122070.1%
 
211920.1%
 
111920.1%
 
151910.1%
 
171860.1%
 
141860.1%
 
201850.1%
 
101830.1%
 
161810.1%
 
181810.1%
 
131800.1%
 
251730.1%
 
231680.1%
 
241660.1%
 
221650.1%
 
Other values (17666)13702851.4%
 
ValueCountFrequency (%) 
012457146.7%
 
13720.1%
 
23060.1%
 
32670.1%
 
42790.1%
 
52510.1%
 
62330.1%
 
72120.1%
 
82130.1%
 
92230.1%
 
ValueCountFrequency (%) 
41668931< 0.1%
 
1323741< 0.1%
 
1220071< 0.1%
 
1157061< 0.1%
 
1149321< 0.1%
 
1144311< 0.1%
 
1142461< 0.1%
 
1091701< 0.1%
 
1087071< 0.1%
 
1067311< 0.1%
 

sms
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct count: 15303
Unique (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1558.3958138819833
Minimum: 0.0
Maximum: 7035146.0
Zeros: 124111
Zeros (%): 46.5%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 78
Q3: 1450
95-th percentile: 7063
Maximum: 7035146
Range: 7035146
Interquartile range (IQR): 1450

Descriptive statistics

Standard deviation: 15021.77779
Coefficient of variation (CV): 9.63925702
Kurtosis: 180954.1736
Mean: 1558.395814
Median Absolute Deviation (MAD): 78
Skewness: 393.2152103
Sum: 415610138
Variance: 225653807.9
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
012411146.5%
 
12210.1%
 
21960.1%
 
41730.1%
 
31690.1%
 
51590.1%
 
121570.1%
 
61570.1%
 
81550.1%
 
111540.1%
 
91540.1%
 
71490.1%
 
101490.1%
 
151400.1%
 
191400.1%
 
17132< 0.1%
 
13129< 0.1%
 
20129< 0.1%
 
37129< 0.1%
 
24127< 0.1%
 
21127< 0.1%
 
14126< 0.1%
 
18126< 0.1%
 
47125< 0.1%
 
48125< 0.1%
 
Other values (15278)13903252.1%
 
ValueCountFrequency (%) 
012411146.5%
 
12210.1%
 
21960.1%
 
31690.1%
 
41730.1%
 
51590.1%
 
61570.1%
 
71490.1%
 
81550.1%
 
91540.1%
 
ValueCountFrequency (%) 
70351461< 0.1%
 
15152351< 0.1%
 
14316941< 0.1%
 
5880321< 0.1%
 
5479341< 0.1%
 
5266451< 0.1%
 
4296971< 0.1%
 
4226851< 0.1%
 
3914121< 0.1%
 
3419871< 0.1%
 

revenueData
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct count: 266013
Unique (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3656.8748307717538
Minimum: 0.0
Maximum: 48413063.70517743
Zeros: 678
Zeros (%): 0.3%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2.065318163
Q1: 138.9876864
median: 757.9046195
Q3: 2658.810992
95-th percentile: 16451.29867
Maximum: 48413063.71
Range: 48413063.71
Interquartile range (IQR): 2519.823305

Descriptive statistics

Standard deviation: 94164.12424
Coefficient of variation (CV): 25.74988989
Kurtosis: 261923.9641
Mean: 3656.874831
Median Absolute Deviation (MAD): 727.1004008
Skewness: 509.488547
Sum: 975255605.5
Variance: 8866882294
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
06780.3%
 
4.587505425e-052< 0.1%
 
53.845885861< 0.1%
 
738.49741951< 0.1%
 
1828.7770861< 0.1%
 
24.333770881< 0.1%
 
80.246483211< 0.1%
 
17402.740981< 0.1%
 
89.317463651< 0.1%
 
1960.1354881< 0.1%
 
31730.6621< 0.1%
 
2428.8277881< 0.1%
 
2449.7729541< 0.1%
 
492.39242461< 0.1%
 
73.13816791< 0.1%
 
411.56591281< 0.1%
 
11.943330331< 0.1%
 
241.23277271< 0.1%
 
5120.7195471< 0.1%
 
64.606819131< 0.1%
 
416.900391< 0.1%
 
362.362441< 0.1%
 
8069.6811711< 0.1%
 
6585.8808161< 0.1%
 
0.072972581241< 0.1%
 
Other values (265988)26598899.7%
 
ValueCountFrequency (%) 
06780.3%
 
1.99779855e-071< 0.1%
 
4.796649247e-071< 0.1%
 
5.279659965e-071< 0.1%
 
6.984846579e-071< 0.1%
 
7.661235011e-071< 0.1%
 
7.721764916e-071< 0.1%
 
1.101797357e-061< 0.1%
 
1.1124283e-061< 0.1%
 
1.469063142e-061< 0.1%
 
ValueCountFrequency (%) 
48413063.711< 0.1%
 
374550.20611< 0.1%
 
324289.36031< 0.1%
 
260554.0381< 0.1%
 
234407.74311< 0.1%
 
231737.42041< 0.1%
 
225302.69241< 0.1%
 
210042.74541< 0.1%
 
199884.78521< 0.1%
 
197643.95371< 0.1%
 

revenueVoice
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct count: 142036
Unique (%): 53.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2798.05934006934
Minimum: 0.0
Maximum: 7304538.566859882
Zeros: 124643
Zeros (%): 46.7%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 63.09816916
Q3: 1850.96798
95-th percentile: 14619.64839
Maximum: 7304538.567
Range: 7304538.567
Interquartile range (IQR): 1850.96798

Descriptive statistics

Standard deviation: 16166.6324
Coefficient of variation (CV): 5.77780184
Kurtosis: 156039.1861
Mean: 2798.05934
Median Absolute Deviation (MAD): 63.09816916
Skewness: 346.2313916
Sum: 746217243.5
Variance: 261360003.2
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
012464346.7%
 
0.55238708382< 0.1%
 
3.6794820822< 0.1%
 
2.4097908372< 0.1%
 
1.2827265742< 0.1%
 
0.65666835222< 0.1%
 
0.41317774582< 0.1%
 
0.67690916892< 0.1%
 
1.434664352< 0.1%
 
5.136085182< 0.1%
 
2.2995025012< 0.1%
 
1.9186196992< 0.1%
 
0.78044497792< 0.1%
 
3.744188242< 0.1%
 
17957.878271< 0.1%
 
192.57577981< 0.1%
 
260.03894571< 0.1%
 
438.17931871< 0.1%
 
407.31965991< 0.1%
 
1901.6733181< 0.1%
 
343.75213461< 0.1%
 
850.39615161< 0.1%
 
2452.4611821< 0.1%
 
710.76916621< 0.1%
 
5497.2305431< 0.1%
 
Other values (142011)14201153.2%
 
ValueCountFrequency (%) 
012464346.7%
 
0.060657640821< 0.1%
 
0.064412593841< 0.1%
 
0.071777615331< 0.1%
 
0.11279641371< 0.1%
 
0.13147880381< 0.1%
 
0.13208224751< 0.1%
 
0.13672020981< 0.1%
 
0.14962949821< 0.1%
 
0.16186411611< 0.1%
 
ValueCountFrequency (%) 
7304538.5671< 0.1%
 
272198.2411< 0.1%
 
207280.54711< 0.1%
 
202273.87551< 0.1%
 
197374.00081< 0.1%
 
185707.22211< 0.1%
 
184748.40151< 0.1%
 
175923.98071< 0.1%
 
175365.1961< 0.1%
 
174553.41581< 0.1%
 

revenueSms
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct count: 142514
Unique (%): 53.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1549.7010156578885
Minimum: 0.0
Maximum: 7398027.683505454
Zeros: 124168
Zeros (%): 46.6%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 71.09182077
Q3: 1500.724922
95-th percentile: 7218.960124
Maximum: 7398027.684
Range: 7398027.684
Interquartile range (IQR): 1500.724922

Descriptive statistics

Standard deviation: 14805.14409
Coefficient of variation (CV): 9.553548681
Kurtosis: 233586.3882
Mean: 1549.701016
Median Absolute Deviation (MAD): 71.09182077
Skewness: 467.6719718
Sum: 413291313.6
Variance: 219192291.6
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
012416846.6%
 
5.9757500292< 0.1%
 
16.61596362< 0.1%
 
1.7596653012< 0.1%
 
0.28817374992< 0.1%
 
0.640418992< 0.1%
 
0.64154159152< 0.1%
 
0.047472498242< 0.1%
 
6.4389914472< 0.1%
 
2.5448386212< 0.1%
 
2.7571498842< 0.1%
 
683.05301361< 0.1%
 
1069.7819671< 0.1%
 
83.736843451< 0.1%
 
2750.5367291< 0.1%
 
1787.5508761< 0.1%
 
444.6699351< 0.1%
 
2767.4394011< 0.1%
 
7713.0851091< 0.1%
 
1146.3236551< 0.1%
 
2378.8461131< 0.1%
 
615.04698441< 0.1%
 
922.69368861< 0.1%
 
2940.9299071< 0.1%
 
741.64307381< 0.1%
 
Other values (142489)14248953.4%
 
ValueCountFrequency (%) 
012416846.6%
 
0.00028469583651< 0.1%
 
0.00043000617191< 0.1%
 
0.00080290936941< 0.1%
 
0.0033653252951< 0.1%
 
0.0033839800671< 0.1%
 
0.0039197567911< 0.1%
 
0.004476762771< 0.1%
 
0.004747599121< 0.1%
 
0.0052731754541< 0.1%
 
ValueCountFrequency (%) 
7398027.6841< 0.1%
 
123441.77351< 0.1%
 
118153.80891< 0.1%
 
109397.94061< 0.1%
 
107507.14011< 0.1%
 
105932.48921< 0.1%
 
103742.10081< 0.1%
 
99919.964591< 0.1%
 
96590.565291< 0.1%
 
95536.833391< 0.1%
 

yieldData
Real number (ℝ≥0)

SKEWED

Distinct count: 265951
Unique (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08860814573027018
Minimum: 0.0
Maximum: 577.1884074485318
Zeros: 678
Zeros (%): 0.3%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0.02897523099
Q1: 0.06115838414
median: 0.08115910213
Q3: 0.1006106922
95-th percentile: 0.1420108304
Maximum: 577.1884074
Range: 577.1884074
Interquartile range (IQR): 0.03945230804

Descriptive statistics

Standard deviation: 1.129661413
Coefficient of variation (CV): 12.74895669
Kurtosis: 255405.4317
Mean: 0.08860814573
Median Absolute Deviation (MAD): 0.01971499979
Skewness: 500.3643936
Sum: 23630.99499
Variance: 1.276134907
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
06780.3%
 
0.012101079374< 0.1%
 
0.0038608824584< 0.1%
 
0.0038608824583< 0.1%
 
0.024514948933< 0.1%
 
0.065430125342< 0.1%
 
0.080422140542< 0.1%
 
0.012980690442< 0.1%
 
1.2158981462< 0.1%
 
0.14918549462< 0.1%
 
0.017411612182< 0.1%
 
0.084163870892< 0.1%
 
0.035776145472< 0.1%
 
0.041704594772< 0.1%
 
0.018539996262< 0.1%
 
0.015972987622< 0.1%
 
0.014619060842< 0.1%
 
0.13301758082< 0.1%
 
0.003889155832< 0.1%
 
0.16026442692< 0.1%
 
0.066541240342< 0.1%
 
0.099490590292< 0.1%
 
0.53061014922< 0.1%
 
0.14638746872< 0.1%
 
0.0072426199262< 0.1%
 
Other values (265926)26595999.7%
 
ValueCountFrequency (%) 
06780.3%
 
4.461660903e-081< 0.1%
 
2.18388152e-071< 0.1%
 
2.189515125e-071< 0.1%
 
5.165946026e-071< 0.1%
 
6.986375115e-071< 0.1%
 
1.015945527e-061< 0.1%
 
1.580358449e-061< 0.1%
 
1.84733864e-061< 0.1%
 
4.24829652e-061< 0.1%
 
ValueCountFrequency (%) 
577.18840741< 0.1%
 
41.915861821< 0.1%
 
34.972009781< 0.1%
 
25.243224281< 0.1%
 
22.011401591< 0.1%
 
20.391133271< 0.1%
 
15.104310851< 0.1%
 
12.656438581< 0.1%
 
12.604133451< 0.1%
 
11.597754431< 0.1%
 

yieldVoice
Real number (ℝ≥0)

ZEROS

Distinct count: 141993
Unique (%): 53.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8226020813040287
Minimum: 0.0
Maximum: 131.4702993387589
Zeros: 124643
Zeros (%): 46.7%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.7350170214
Q3: 1.540280851
95-th percentile: 2.207640722
Maximum: 131.4702993
Range: 131.4702993
Interquartile range (IQR): 1.540280851

Descriptive statistics

Standard deviation: 0.9871049906
Coefficient of variation (CV): 1.199978718
Kurtosis: 1647.686069
Mean: 0.8226020813
Median Absolute Deviation (MAD): 0.7350170214
Skewness: 15.63054775
Sum: 219380.5717
Variance: 0.9743762624
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
012464346.7%
 
3.0643674393< 0.1%
 
0.41317774583< 0.1%
 
0.69601856633< 0.1%
 
0.93604706013< 0.1%
 
3.6794820822< 0.1%
 
0.700741492< 0.1%
 
0.57286993822< 0.1%
 
10.236902792< 0.1%
 
0.55238708382< 0.1%
 
0.62463070192< 0.1%
 
1.8704241862< 0.1%
 
5.136085182< 0.1%
 
1.3391998342< 0.1%
 
0.81481640812< 0.1%
 
0.29933988792< 0.1%
 
1.6830377062< 0.1%
 
0.85306564132< 0.1%
 
5.973906962< 0.1%
 
0.76991144522< 0.1%
 
0.23550972672< 0.1%
 
1.0983139022< 0.1%
 
0.65399215392< 0.1%
 
1.2450681822< 0.1%
 
23.873654072< 0.1%
 
Other values (141968)14199653.2%
 
ValueCountFrequency (%) 
012464346.7%
 
0.013208224751< 0.1%
 
0.015056585231< 0.1%
 
0.021470864611< 0.1%
 
0.026206775521< 0.1%
 
0.028042984591< 0.1%
 
0.02964331051< 0.1%
 
0.032887474691< 0.1%
 
0.034171036551< 0.1%
 
0.035657168961< 0.1%
 
ValueCountFrequency (%) 
131.47029931< 0.1%
 
102.84416231< 0.1%
 
45.739216831< 0.1%
 
41.861258461< 0.1%
 
39.507529471< 0.1%
 
39.490737651< 0.1%
 
38.133074461< 0.1%
 
29.928170321< 0.1%
 
23.873654072< 0.1%
 
23.344237971< 0.1%
 

yieldSms
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct count: 142497
Unique (%): 53.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6324937135912497
Minimum: 0.0
Maximum: 158.3961530856573
Zeros: 124168
Zeros (%): 46.6%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.4210379211
Q3: 1.100916454
95-th percentile: 1.86326396
Maximum: 158.3961531
Range: 158.3961531
Interquartile range (IQR): 1.100916454

Descriptive statistics

Standard deviation: 1.013739341
Coefficient of variation (CV): 1.602765876
Kurtosis: 3719.927725
Mean: 0.6324937136
Median Absolute Deviation (MAD): 0.4210379211
Skewness: 33.91918453
Sum: 168680.381
Variance: 1.027667452
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
012416846.6%
 
0.85674194733< 0.1%
 
3.751672343< 0.1%
 
0.1440868753< 0.1%
 
0.0052731754542< 0.1%
 
0.60967991582< 0.1%
 
0.089474596422< 0.1%
 
0.0019090694972< 0.1%
 
0.0088771464772< 0.1%
 
1.6097478622< 0.1%
 
2.1508323832< 0.1%
 
0.51794779662< 0.1%
 
0.093872766992< 0.1%
 
1.2724193112< 0.1%
 
0.32077079582< 0.1%
 
0.87861913822< 0.1%
 
1.7605365552< 0.1%
 
19.402777782< 0.1%
 
1.7596653012< 0.1%
 
1.0520787362< 0.1%
 
0.044521019892< 0.1%
 
2.6605909952< 0.1%
 
0.59763596242< 0.1%
 
0.60395528172< 0.1%
 
0.047472498242< 0.1%
 
Other values (142472)14247253.4%
 
ValueCountFrequency (%) 
012416846.6%
 
0.0002150030861< 0.1%
 
0.00028469583651< 0.1%
 
0.00040145468471< 0.1%
 
0.0011613398251< 0.1%
 
0.001535243831< 0.1%
 
0.0017420097371< 0.1%
 
0.0018938422281< 0.1%
 
0.0019090694972< 0.1%
 
0.0019598783951< 0.1%
 
ValueCountFrequency (%) 
158.39615311< 0.1%
 
121.48073071< 0.1%
 
87.857035211< 0.1%
 
85.505831271< 0.1%
 
77.998358851< 0.1%
 
61.081230021< 0.1%
 
61.081230021< 0.1%
 
49.854818681< 0.1%
 
47.745440361< 0.1%
 
44.114032661< 0.1%
 

brand
Categorical

CONSTANT
REJECTED

Distinct count: 1
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Globe Postpaid: 266691
ValueCountFrequency (%) 
Globe Postpaid266691100.0%
 

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Overview of Unicode Properties

Unique unicode characters: 13
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
o53338214.3%
 
G2666917.1%
 
l2666917.1%
 
b2666917.1%
 
e2666917.1%
 
2666917.1%
 
P2666917.1%
 
s2666917.1%
 
t2666917.1%
 
p2666917.1%
 
a2666917.1%
 
i2666917.1%
 
d2666917.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter293360178.6%
 
Uppercase Letter53338214.3%
 
Space Separator2666917.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
G26669150.0%
 
P26669150.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
o53338218.2%
 
l2666919.1%
 
b2666919.1%
 
e2666919.1%
 
s2666919.1%
 
t2666919.1%
 
p2666919.1%
 
a2666919.1%
 
i2666919.1%
 
d2666919.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
266691100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin346698392.9%
 
Common2666917.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
o53338215.4%
 
G2666917.7%
 
l2666917.7%
 
b2666917.7%
 
e2666917.7%
 
P2666917.7%
 
s2666917.7%
 
t2666917.7%
 
p2666917.7%
 
a2666917.7%
 
i2666917.7%
 
d2666917.7%
 

Most frequent Common characters

ValueCountFrequency (%) 
266691100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3733674100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
o53338214.3%
 
G2666917.1%
 
l2666917.1%
 
b2666917.1%
 
e2666917.1%
 
2666917.1%
 
P2666917.1%
 
s2666917.1%
 
t2666917.1%
 
p2666917.1%
 
a2666917.1%
 
i2666917.1%
 
d2666917.1%
 

unitId
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 9534
Unique (%): 3.8%
Missing: 13198
Missing (%): 4.9%
Memory size: 2.0 MiB
FCICTMALL: 65
PEARLPLAZA2PQUENCRID: 64
DAGUPAN3: 60
LIGAO: 60
UNISITE: 60
Other values (9529): 253184
ValueCountFrequency (%) 
FCICTMALL65< 0.1%
 
PEARLPLAZA2PQUENCRID64< 0.1%
 
DAGUPAN360< 0.1%
 
LIGAO60< 0.1%
 
UNISITE60< 0.1%
 
DELROSAR60< 0.1%
 
TANDAKUTYOTANAYRZL60< 0.1%
 
PRINZA60< 0.1%
 
ANGELWEST60< 0.1%
 
MAGSAYSAY59< 0.1%
 
CALPN259< 0.1%
 
BLUMINTRE59< 0.1%
 
SMHYPERM59< 0.1%
 
SANNICHOL59< 0.1%
 
18EIGHTY59< 0.1%
 
BACOOR259< 0.1%
 
BTNGS259< 0.1%
 
ELPUEBLO158< 0.1%
 
MATIMBMAL58< 0.1%
 
CARME258< 0.1%
 
CENTEXML57< 0.1%
 
SFPAGUSTN57< 0.1%
 
CARGOHAUS57< 0.1%
 
BAGUIO57< 0.1%
 
CANERO57< 0.1%
 
Other values (9509)25201394.5%
 
(Missing)131984.9%
 

Length

Max length: 25
Median length: 8
Mean length: 8.280455658
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 43
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
A34169315.5%
 
N1687797.6%
 
L1444926.5%
 
S1287005.8%
 
O1284545.8%
 
I1272865.8%
 
T1139295.2%
 
R1098025.0%
 
C1020634.6%
 
M998954.5%
 
E954984.3%
 
B900094.1%
 
G886464.0%
 
U783833.5%
 
P702213.2%
 
D654683.0%
 
Y327011.5%
 
V301941.4%
 
n263991.2%
 
H249101.1%
 
K226901.0%
 
W180100.8%
 
2161790.7%
 
Z144410.7%
 
a132010.6%
 
Other values (18)562802.5%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter213419996.6%
 
Lowercase Letter396211.8%
 
Decimal Number345031.6%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A34169316.0%
 
N1687797.9%
 
L1444926.8%
 
S1287006.0%
 
O1284546.0%
 
I1272866.0%
 
T1139295.3%
 
R1098025.1%
 
C1020634.8%
 
M998954.7%
 
E954984.5%
 
B900094.2%
 
G886464.2%
 
U783833.7%
 
P702213.3%
 
D654683.1%
 
Y327011.5%
 
V301941.4%
 
H249101.2%
 
K226901.1%
 
W180100.8%
 
Z144410.7%
 
F120000.6%
 
Q119210.6%
 
J101960.5%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
21617946.9%
 
1774622.5%
 
3434112.6%
 
417715.1%
 
511513.3%
 
87992.3%
 
67332.1%
 
07332.1%
 
95551.6%
 
74951.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n2639966.6%
 
a1320133.3%
 
s6< 0.1%
 
l6< 0.1%
 
c3< 0.1%
 
r3< 0.1%
 
o3< 0.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin217382098.4%
 
Common345031.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
A34169315.7%
 
N1687797.8%
 
L1444926.6%
 
S1287005.9%
 
O1284545.9%
 
I1272865.9%
 
T1139295.2%
 
R1098025.1%
 
C1020634.7%
 
M998954.6%
 
E954984.4%
 
B900094.1%
 
G886464.1%
 
U783833.6%
 
P702213.2%
 
D654683.0%
 
Y327011.5%
 
V301941.4%
 
n263991.2%
 
H249101.1%
 
K226901.0%
 
W180100.8%
 
Z144410.7%
 
a132010.6%
 
F120000.6%
 
Other values (8)259561.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
21617946.9%
 
1774622.5%
 
3434112.6%
 
417715.1%
 
511513.3%
 
87992.3%
 
67332.1%
 
07332.1%
 
95551.6%
 
74951.4%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2208323100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
A34169315.5%
 
N1687797.6%
 
L1444926.5%
 
S1287005.8%
 
O1284545.8%
 
I1272865.8%
 
T1139295.2%
 
R1098025.0%
 
C1020634.6%
 
M998954.5%
 
E954984.3%
 
B900094.1%
 
G886464.0%
 
U783833.5%
 
P702213.2%
 
D654683.0%
 
Y327011.5%
 
V301941.4%
 
n263991.2%
 
H249101.1%
 
K226901.0%
 
W180100.8%
 
2161790.7%
 
Z144410.7%
 
a132010.6%
 
Other values (18)562802.5%
 

dataScaleFactor
Real number (ℝ≥0)

SKEWED

Distinct count: 213954
Unique (%): 80.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8787710785094583
Minimum: 0.00018483832807570977
Maximum: 8125.818219866071
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0.0001848383281
5-th percentile: 0.2350499094
Q1: 0.4974800131
median: 0.7212358021
Q3: 1
95-th percentile: 1.694816613
Maximum: 8125.81822
Range: 8125.818035
Interquartile range (IQR): 0.5025199869

Descriptive statistics

Standard deviation: 15.8480096
Coefficient of variation (CV): 18.03428673
Kurtosis: 259056.1929
Mean: 0.8787710785
Median Absolute Deviation (MAD): 0.2787641979
Skewness: 505.5685903
Sum: 234360.3377
Variance: 251.1594082
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
15273819.8%
 
0.69428405991< 0.1%
 
0.57544986461< 0.1%
 
0.45131850971< 0.1%
 
0.812351741< 0.1%
 
0.37802398541< 0.1%
 
1.1736057031< 0.1%
 
1.0525649561< 0.1%
 
0.49168247611< 0.1%
 
0.68707044851< 0.1%
 
1.0216521111< 0.1%
 
0.35551879721< 0.1%
 
0.53328953411< 0.1%
 
0.50301715731< 0.1%
 
0.54134495571< 0.1%
 
1.522872311< 0.1%
 
0.22690622511< 0.1%
 
0.50019808671< 0.1%
 
0.22823505441< 0.1%
 
0.6516615211< 0.1%
 
0.99830520671< 0.1%
 
0.96184228131< 0.1%
 
0.22888012581< 0.1%
 
1.4347945271< 0.1%
 
0.65383582441< 0.1%
 
Other values (213929)21392980.2%
 
ValueCountFrequency (%) 
0.00018483832811< 0.1%
 
0.00022133243091< 0.1%
 
0.00025810695691< 0.1%
 
0.0002954054971< 0.1%
 
0.0003004717231< 0.1%
 
0.00034911392021< 0.1%
 
0.00057444852941< 0.1%
 
0.00097116712711< 0.1%
 
0.0011618717191< 0.1%
 
0.0013966699271< 0.1%
 
ValueCountFrequency (%) 
8125.818221< 0.1%
 
569.56174161< 0.1%
 
408.45093711< 0.1%
 
304.41757811< 0.1%
 
183.04419831< 0.1%
 
172.60162961< 0.1%
 
129.33102911< 0.1%
 
110.53248051< 0.1%
 
110.00460841< 0.1%
 
103.80201661< 0.1%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
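
The calculation described above can be sketched in a few lines of plain Python. The function name and the toy data are illustrative assumptions, not part of the report or of any library.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

x = [1, 2, 3, 4, 5]
linear = [2 * v + 1 for v in x]    # exact linear relationship
squared = [v ** 2 for v in x]      # monotonic, but not linear

print(pearson_r(x, linear))    # very close to 1
print(pearson_r(x, squared))   # high, but below 1
```

Rescaling or shifting either input, for example replacing `x` with `10 * x + 3`, leaves r unchanged, which is the location and scale invariance mentioned above.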

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
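
A minimal sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. Giving tied values the average of their positions is a common convention and is an assumption here, as are the helper names.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance over product of standard deviations (the 1/n factors cancel)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
cubed = [v ** 3 for v in x]      # nonlinear but strictly increasing
print(spearman_rho(x, cubed))    # very close to 1
```

Because only the ordering matters, any strictly increasing transformation of either variable leaves ρ unchanged.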

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
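
The pair-counting definition can be sketched directly. Taken literally, it corresponds to the variant usually called tau-a. Library implementations often apply a tie correction (tau-b), so results may differ when the data contains ties. The function name and data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # Pairs tied in x or in y count as neither.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]              # two adjacent swaps relative to x
print(kendall_tau_a(x, y))       # 8 concordant, 2 discordant, 10 pairs -> 0.6
```

The double loop is quadratic in the number of observations, so for a dataset of this size a library routine would be the practical choice.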

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

Sample

First rows

index  cellId  data  voice  sms  revenueData  revenueVoice  revenueSms  yieldData  yieldVoice  yieldSms  brand  unitId  dataScaleFactor
0  ROBIROSAF-1  88016.369511  0.0  0.0  13146.840949  0.000000  0.000000  0.149368  0.000000  0.000000  Globe Postpaid  ROBIROSA  0.864947
1  CABARUANPGSNH-43  1058.925083  0.0  0.0  95.938449  0.000000  0.000000  0.090600  0.000000  0.000000  Globe Postpaid  CABARUANPGSN  0.503747
2  UPTWNBPO2BGC-3  1606.904400  391.0  1155.0  139.672560  514.348372  1935.673365  0.086920  1.315469  1.675908  Globe Postpaid  UPTWNBPO2BGC  1.000000
3  CALASIAO6Z-LA_4RFS  919.774213  2718.0  256.0  32.562313  1441.404061  95.681845  0.035403  0.530318  0.373757  Globe Postpaid  CALASIAO6  0.833502
4  PENARANDAZ-B  20432.604806  7121.0  3054.0  1464.652526  12750.752944  2681.352630  0.071682  1.790585  0.877981  Globe Postpaid  PENARANDA  0.729193
5  RCARASJ-2  3308.048463  80.0  1374.0  230.192187  131.639278  557.721388  0.069585  1.645491  0.405911  Globe Postpaid  RCARAS  0.466188
6  MANIPONSTBAMBANTAREMF-12  179.814400  0.0  0.0  14.224609  0.000000  0.000000  0.079107  0.000000  0.000000  Globe Postpaid  NaN  1.000000
7  SPLASHVALODZ-L1_4RFS  7499.026116  1412.0  618.0  420.346967  2020.774216  1077.668490  0.056054  1.431143  1.743800  Globe Postpaid  SPLASHVALOD  0.695699
8  ROBINSON1MLAZ-3_4RFS  17790.027561  13820.0  5763.0  2163.180985  28888.749862  7480.054688  0.121595  2.090358  1.297945  Globe Postpaid  ROBINSON1MLA  3.017338
9  BALINGASAF-13  341.102234  0.0  0.0  24.839736  0.000000  0.000000  0.072822  0.000000  0.000000  Globe Postpaid  BALINGASA  0.677181

Last rows

index  cellId  data  voice  sms  revenueData  revenueVoice  revenueSms  yieldData  yieldVoice  yieldSms  brand  unitId  dataScaleFactor
266681  MARGRNVILPRQK-143  3263.987714  0.0  0.0  70.191641  0.000000  0.000000  0.021505  0.000000  0.000000  Globe Postpaid  MARGRNVILPRQ  0.277953
266682  IBARRAH-41  127571.951609  0.0  0.0  10598.560673  0.000000  0.000000  0.083079  0.000000  0.000000  Globe Postpaid  IBARRA  0.408420
266683  SJBULACANX-1  0.000000  1.0  35.0  0.000000  1.900022  11.377079  0.000000  1.900022  0.325059  Globe Postpaid  SJBULACAN  1.000000
266684  CENACLEH-32  29494.948818  0.0  0.0  2138.178337  0.000000  0.000000  0.072493  0.000000  0.000000  Globe Postpaid  CENACLE  0.279792
266685  DISTRICTNPTF-11  25823.053020  0.0  0.0  2165.103292  0.000000  0.000000  0.083844  0.000000  0.000000  Globe Postpaid  DISTRICTNPT  0.114151
266686  BLIWAGESTF-101  11954.628195  0.0  0.0  827.352105  0.000000  0.000000  0.069208  0.000000  0.000000  Globe Postpaid  BLIWAGEST  1.919972
266687  BONISERVJ-D  4168.058614  374.0  2295.0  461.776911  569.424758  261.252073  0.110789  1.522526  0.113835  Globe Postpaid  BONISERV  1.097294
266688  TLISAH-32  9764.534795  0.0  0.0  919.850921  0.000000  0.000000  0.094203  0.000000  0.000000  Globe Postpaid  TLISA  0.751377
266689  CLAVR2-2  988.000700  2240.0  4081.0  103.553095  2937.090399  3845.366712  0.104811  1.311201  0.942261  Globe Postpaid  CLAVR2  1.000000
266690  LUNACAF-13  209.442671  0.0  0.0  12.861845  0.000000  0.000000  0.061410  0.000000  0.000000  Globe Postpaid  LUNACA  0.261211
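
In the sample rows, each yield column appears to equal the matching revenue divided by the matching usage (for example, yieldData is close to revenueData / data). The report does not state this relationship, so the sketch below only checks it on three sample rows (index 2, 3 and 266689). The helper name and the tolerance are assumptions.

```python
import math

# (data, voice, sms, revenueData, revenueVoice, revenueSms,
#  yieldData, yieldVoice, yieldSms) for sample rows 2, 3 and 266689.
rows = [
    (1606.904400, 391.0, 1155.0, 139.672560, 514.348372, 1935.673365,
     0.086920, 1.315469, 1.675908),
    (919.774213, 2718.0, 256.0, 32.562313, 1441.404061, 95.681845,
     0.035403, 0.530318, 0.373757),
    (988.000700, 2240.0, 4081.0, 103.553095, 2937.090399, 3845.366712,
     0.104811, 1.311201, 0.942261),
]

def consistent(usage, revenue, reported_yield, tol=1e-5):
    """True when revenue / usage matches the reported yield within tol."""
    return math.isclose(revenue / usage, reported_yield, abs_tol=tol)

checks = [
    (consistent(d, rd, yd), consistent(v, rv, yv), consistent(s, rs, ys))
    for d, v, s, rd, rv, rs, yd, yv, ys in rows
]
for row_checks in checks:
    print(row_checks)
```

Rows with zero usage show a yield of 0 in the sample rather than an undefined value, so a full check would need to treat a zero denominator separately.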